Overview
Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 2500 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 1000 |
| Duplicate rows (%) | 40.0% |
| Total size in memory | 470.4 KiB |
| Average record size in memory | 192.7 B |
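The two derived figures in this table can be recomputed from the raw counts. The sketch below is only an arithmetic check, and it assumes that 1 KiB is 1024 bytes; the variable names are illustrative and do not come from the report.

```python
# Figures copied from the "Dataset statistics" table above.
n_rows = 2500
n_duplicate_rows = 1000
total_size_kib = 470.4

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results agree with the reported 40.0% and 192.7 B after rounding to one decimal place.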
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 3 |
| Boolean | 1 |
Alerts
| Dataset has 1000 (40.0%) duplicate rows | Duplicates |
|---|---|
| Pregnancies has 265 (10.6%) zeros | Zeros |
Reproduction
| Analysis started | 2025-04-03 06:12:03.293279 |
|---|---|
| Analysis finished | 2025-04-03 06:12:07.225860 |
| Duration | 3.93 seconds |
| Software version | ydata-profiling v4.15.0 |
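The reported duration is simply the difference between the two timestamps. A minimal check using only the standard library, with the timestamps copied from the table above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2025-04-03 06:12:03.293279")
finished = datetime.fromisoformat("2025-04-03 06:12:07.225860")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"{duration_s:.2f} seconds")
```

Rounded to two decimals, this matches the 3.93 seconds shown in the report.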
Variables
Pregnancies
Real number (ℝ)
Zeros 
| Distinct | 10 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.5244 |
| Minimum | 0 |
|---|---|
| Maximum | 9 |
| Zeros | 265 |
| Zeros (%) | 10.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 19.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 5 |
| Q3 | 7 |
| 95-th percentile | 9 |
| Maximum | 9 |
| Range | 9 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 2.9286235 |
|---|---|
| Coefficient of variation (CV) | 0.64729543 |
| Kurtosis | -1.2412125 |
| Mean | 4.5244 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.0114272 |
| Sum | 11311 |
| Variance | 8.5768354 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=10)
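Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 9 | 292 | 11.7% |
| 6 | 279 | 11.2% |
| 1 | 276 | 11.0% |
| 0 | 265 | 10.6% |
| 3 | 251 | 10.0% |
| 4 | 241 | 9.6% |
| 5 | 238 | 9.5% |
| 7 | 232 | 9.3% |
| 8 | 225 | 9.0% |
| 2 | 201 | 8.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 265 | 10.6% |
| 1 | 276 | 11.0% |
| 2 | 201 | 8.0% |
| 3 | 251 | 10.0% |
| 4 | 241 | 9.6% |
| 5 | 238 | 9.5% |
| 6 | 279 | 11.2% |
| 7 | 232 | 9.3% |
| 8 | 225 | 9.0% |
| 9 | 292 | 11.7% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 9 | 292 | 11.7% |
| 8 | 225 | 9.0% |
| 7 | 232 | 9.3% |
| 6 | 279 | 11.2% |
| 5 | 238 | 9.5% |
| 4 | 241 | 9.6% |
| 3 | 251 | 10.0% |
| 2 | 201 | 8.0% |
| 1 | 276 | 11.0% |
| 0 | 265 | 10.6% |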
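Because Pregnancies takes only ten distinct values, its frequency table is enough to rebuild the summary statistics reported for it. The sketch below expands the counts back into one value per row and recomputes the sum, mean, variance, median and MAD. It assumes the report's variance uses the sample (n - 1) denominator; the report does not state this, but the numbers agree under that assumption.

```python
import statistics

# Value -> count, copied from the Pregnancies frequency table above.
counts = {0: 265, 1: 276, 2: 201, 3: 251, 4: 241,
          5: 238, 6: 279, 7: 232, 8: 225, 9: 292}

# Expand the table back into one value per observation.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]

n = len(values)
total = sum(values)
mean = total / n
variance = statistics.variance(values)  # sample variance, n - 1 denominator
median = statistics.median(values)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(v - median) for v in values)

print(n, total, round(mean, 4), round(variance, 7), median, mad)
```

The recomputed values line up with the report: 2500 observations, a sum of 11311, a mean of 4.5244, a variance of about 8.5768354, a median of 5 and a MAD of 3.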
Glucose
Real number (ℝ)
| Distinct | 694 |
|---|---|
| Distinct (%) | 27.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 134.53924 |
| Minimum | 70.5 |
|---|---|
| Maximum | 199.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 19.7 KiB |
Quantile statistics
| Minimum | 70.5 |
|---|---|
| 5-th percentile | 77.295 |
| Q1 | 101.8 |
| median | 134.3 |
| Q3 | 166.8 |
| 95-th percentile | 193.5 |
| Maximum | 199.9 |
| Range | 129.4 |
| Interquartile range (IQR) | 65 |
Descriptive statistics
| Standard deviation | 37.482948 |
|---|---|
| Coefficient of variation (CV) | 0.27860235 |
| Kurtosis | -1.2156746 |
| Mean | 134.53924 |
| Median Absolute Deviation (MAD) | 32.5 |
| Skewness | 0.029814867 |
| Sum | 336348.1 |
| Variance | 1404.9714 |
| Monotonicity | Not monotonic |
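Several rows in the quantile and descriptive tables are derived from others: the range from the extremes, the IQR from the quartiles, the coefficient of variation from the standard deviation and the mean, and the variance from the standard deviation. A short check for Glucose, using the rounded figures printed above (so small rounding differences are expected):

```python
# Figures copied from the Glucose tables above.
minimum, maximum = 70.5, 199.9
q1, q3 = 101.8, 166.8
mean, std = 134.53924, 37.482948

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared

print(round(value_range, 1), round(iqr, 1), round(cv, 8), round(variance, 4))
```

The same relations hold for every numeric variable in this report, so they are a quick way to spot a mis-transcribed value.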
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 118.9 | 11 | 0.4% |
| 178.7 | 11 | 0.4% |
| 182.7 | 11 | 0.4% |
| 169.1 | 10 | 0.4% |
| 103.6 | 10 | 0.4% |
| 151.7 | 10 | 0.4% |
| 117.7 | 10 | 0.4% |
| 123.4 | 10 | 0.4% |
| 154 | 9 | 0.4% |
| 83.9 | 9 | 0.4% |
| Other values (684) | 2399 | 96.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 70.5 | 3 | 0.1% |
| 70.9 | 2 | 0.1% |
| 71 | 4 | 0.2% |
| 71.1 | 2 | 0.1% |
| 71.3 | 3 | 0.1% |
| 71.7 | 3 | 0.1% |
| 71.8 | 3 | 0.1% |
| 72.2 | 8 | 0.3% |
| 72.3 | 4 | 0.2% |
| 72.4 | 2 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 199.9 | 4 | 0.2% |
| 199.7 | 2 | 0.1% |
| 199.5 | 2 | 0.1% |
| 199.4 | 2 | 0.1% |
| 198.9 | 4 | 0.2% |
| 198.8 | 2 | 0.1% |
| 198.6 | 2 | 0.1% |
| 198.5 | 2 | 0.1% |
| 198.4 | 7 | 0.3% |
| 198.3 | 3 | 0.1% |
BloodPressure
Real number (ℝ)
| Distinct | 489 |
|---|---|
| Distinct (%) | 19.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 89.23704 |
| Minimum | 60 |
|---|---|
| Maximum | 120 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 19.7 KiB |
Quantile statistics
| Minimum | 60 |
|---|---|
| 5-th percentile | 63.1 |
| Q1 | 74.1 |
| median | 88.9 |
| Q3 | 104.5 |
| 95-th percentile | 116.6 |
| Maximum | 120 |
| Range | 60 |
| Interquartile range (IQR) | 30.4 |
Descriptive statistics
| Standard deviation | 17.288059 |
|---|---|
| Coefficient of variation (CV) | 0.19373188 |
| Kurtosis | -1.1950889 |
| Mean | 89.23704 |
| Median Absolute Deviation (MAD) | 15 |
| Skewness | 0.086272563 |
| Sum | 223092.6 |
| Variance | 298.87699 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 79.2 | 18 | 0.7% |
| 78.2 | 17 | 0.7% |
| 71.5 | 17 | 0.7% |
| 64.7 | 16 | 0.6% |
| 90.2 | 15 | 0.6% |
| 71.6 | 15 | 0.6% |
| 77.2 | 15 | 0.6% |
| 94.3 | 14 | 0.6% |
| 83 | 14 | 0.6% |
| 75.9 | 14 | 0.6% |
| Other values (479) | 2345 | 93.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 60 | 7 | 0.3% |
| 60.1 | 2 | 0.1% |
| 60.2 | 6 | 0.2% |
| 60.4 | 6 | 0.2% |
| 60.5 | 6 | 0.2% |
| 60.6 | 6 | 0.2% |
| 60.8 | 2 | 0.1% |
| 61 | 2 | 0.1% |
| 61.1 | 2 | 0.1% |
| 61.2 | 10 | 0.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 120 | 2 | 0.1% |
| 119.9 | 2 | 0.1% |
| 119.8 | 2 | 0.1% |
| 119.7 | 2 | 0.1% |
| 119.6 | 2 | 0.1% |
| 119.5 | 2 | 0.1% |
| 119.4 | 5 | 0.2% |
| 119.2 | 3 | 0.1% |
| 119.1 | 6 | 0.2% |
| 118.8 | 3 | 0.1% |
SkinThickness
Real number (ℝ)
| Distinct | 373 |
|---|---|
| Distinct (%) | 14.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 29.19868 |
| Minimum | 10 |
|---|---|
| Maximum | 50 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 19.7 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 11.6 |
| Q1 | 19.4 |
| median | 28.7 |
| Q3 | 38.6 |
| 95-th percentile | 47.5 |
| Maximum | 50 |
| Range | 40 |
| Interquartile range (IQR) | 19.2 |
Descriptive statistics
| Standard deviation | 11.335596 |
|---|---|
| Coefficient of variation (CV) | 0.3882229 |
| Kurtosis | -1.1396867 |
| Mean | 29.19868 |
| Median Absolute Deviation (MAD) | 9.6 |
| Skewness | 0.060871458 |
| Sum | 72996.7 |
| Variance | 128.49574 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 33.6 | 19 | 0.8% |
| 20.3 | 19 | 0.8% |
| 34.2 | 18 | 0.7% |
| 42.1 | 18 | 0.7% |
| 25.9 | 17 | 0.7% |
| 18.4 | 16 | 0.6% |
| 37.7 | 16 | 0.6% |
| 11.2 | 15 | 0.6% |
| 17.9 | 15 | 0.6% |
| 39.2 | 15 | 0.6% |
| Other values (363) | 2332 | 93.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10 | 7 | 0.3% |
| 10.1 | 11 | 0.4% |
| 10.2 | 2 | 0.1% |
| 10.3 | 6 | 0.2% |
| 10.4 | 4 | 0.2% |
| 10.5 | 13 | 0.5% |
| 10.6 | 9 | 0.4% |
| 10.7 | 3 | 0.1% |
| 10.8 | 2 | 0.1% |
| 10.9 | 10 | 0.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 50 | 3 | 0.1% |
| 49.9 | 6 | 0.2% |
| 49.8 | 5 | 0.2% |
| 49.7 | 2 | 0.1% |
| 49.6 | 6 | 0.2% |
| 49.5 | 4 | 0.2% |
| 49.4 | 4 | 0.2% |
| 49.3 | 4 | 0.2% |
| 49.2 | 8 | 0.3% |
| 49.1 | 3 | 0.1% |
Insulin
Real number (ℝ)
| Distinct | 850 |
|---|---|
| Distinct (%) | 34.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 157.30172 |
| Minimum | 15 |
|---|---|
| Maximum | 299.6 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 19.7 KiB |
Quantile statistics
| Minimum | 15 |
|---|---|
| 5-th percentile | 31.595 |
| Q1 | 92.05 |
| median | 158.4 |
| Q3 | 222.85 |
| 95-th percentile | 283.7 |
| Maximum | 299.6 |
| Range | 284.6 |
| Interquartile range (IQR) | 130.8 |
Descriptive statistics
| Standard deviation | 79.652502 |
|---|---|
| Coefficient of variation (CV) | 0.50636765 |
| Kurtosis | -1.1179238 |
| Mean | 157.30172 |
| Median Absolute Deviation (MAD) | 65.7 |
| Skewness | 0.0055982029 |
| Sum | 393254.3 |
| Variance | 6344.521 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 86.9 | 10 | 0.4% |
| 211.6 | 9 | 0.4% |
| 223.7 | 9 | 0.4% |
| 92.7 | 8 | 0.3% |
| 179.8 | 8 | 0.3% |
| 141.2 | 8 | 0.3% |
| 78.1 | 8 | 0.3% |
| 38.9 | 8 | 0.3% |
| 293.2 | 7 | 0.3% |
| 77.7 | 7 | 0.3% |
| Other values (840) | 2418 | 96.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 15 | 2 | 0.1% |
| 15.1 | 3 | 0.1% |
| 15.3 | 3 | 0.1% |
| 15.9 | 2 | 0.1% |
| 16.2 | 2 | 0.1% |
| 16.5 | 2 | 0.1% |
| 16.6 | 2 | 0.1% |
| 17.3 | 2 | 0.1% |
| 17.6 | 6 | 0.2% |
| 17.7 | 3 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 299.6 | 5 | 0.2% |
| 299.1 | 2 | 0.1% |
| 298.9 | 2 | 0.1% |
| 298.7 | 4 | 0.2% |
| 298.4 | 2 | 0.1% |
| 298.1 | 2 | 0.1% |
| 297.9 | 3 | 0.1% |
| 297.7 | 2 | 0.1% |
| 296.1 | 3 | 0.1% |
| 296 | 2 | 0.1% |
BMI
Real number (ℝ)
| Distinct | 268 |
|---|---|
| Distinct (%) | 10.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 31.3718 |
| Minimum | 18 |
|---|---|
| Maximum | 45 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 19.7 KiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 19.1 |
| Q1 | 24.6 |
| median | 31.3 |
| Q3 | 38.1 |
| 95-th percentile | 43.5 |
| Maximum | 45 |
| Range | 27 |
| Interquartile range (IQR) | 13.5 |
Descriptive statistics
| Standard deviation | 7.7772033 |
|---|---|
| Coefficient of variation (CV) | 0.24790427 |
| Kurtosis | -1.186453 |
| Mean | 31.3718 |
| Median Absolute Deviation (MAD) | 6.8 |
| Skewness | 0.012422574 |
| Sum | 78429.5 |
| Variance | 60.484891 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 42.1 | 25 | 1.0% |
| 43.3 | 23 | 0.9% |
| 30.7 | 22 | 0.9% |
| 20.9 | 21 | 0.8% |
| 31.6 | 21 | 0.8% |
| 31.3 | 21 | 0.8% |
| 18.5 | 20 | 0.8% |
| 37.4 | 19 | 0.8% |
| 39.1 | 18 | 0.7% |
| 33.1 | 18 | 0.7% |
| Other values (258) | 2292 | 91.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 18 | 5 | 0.2% |
| 18.1 | 14 | 0.6% |
| 18.2 | 12 | 0.5% |
| 18.3 | 7 | 0.3% |
| 18.4 | 11 | 0.4% |
| 18.5 | 20 | 0.8% |
| 18.6 | 8 | 0.3% |
| 18.8 | 13 | 0.5% |
| 18.9 | 18 | 0.7% |
| 19 | 8 | 0.3% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 45 | 2 | 0.1% |
| 44.9 | 13 | 0.5% |
| 44.8 | 9 | 0.4% |
| 44.7 | 2 | 0.1% |
| 44.6 | 12 | 0.5% |
| 44.5 | 9 | 0.4% |
| 44.4 | 16 | 0.6% |
| 44.3 | 11 | 0.4% |
| 44.2 | 5 | 0.2% |
| 44.1 | 10 | 0.4% |
DiabetesPedigreeFunction
Real number (ℝ)
| Distinct | 231 |
|---|---|
| Distinct (%) | 9.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.324988 |
| Minimum | 0.11 |
|---|---|
| Maximum | 2.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 19.7 KiB |
Quantile statistics
| Minimum | 0.11 |
|---|---|
| 5-th percentile | 0.21 |
| Q1 | 0.75 |
| median | 1.32 |
| Q3 | 1.92 |
| 95-th percentile | 2.36 |
| Maximum | 2.5 |
| Range | 2.39 |
| Interquartile range (IQR) | 1.17 |
Descriptive statistics
| Standard deviation | 0.68695601 |
|---|---|
| Coefficient of variation (CV) | 0.51846206 |
| Kurtosis | -1.1881789 |
| Mean | 1.324988 |
| Median Absolute Deviation (MAD) | 0.59 |
| Skewness | -0.053062855 |
| Sum | 3312.47 |
| Variance | 0.47190856 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.2 | 29 | 1.2% |
| 1.14 | 27 | 1.1% |
| 2.39 | 23 | 0.9% |
| 0.97 | 23 | 0.9% |
| 2.15 | 23 | 0.9% |
| 2.1 | 22 | 0.9% |
| 1.36 | 22 | 0.9% |
| 1.96 | 22 | 0.9% |
| 0.65 | 21 | 0.8% |
| 0.99 | 21 | 0.8% |
| Other values (221) | 2267 | 90.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.11 | 12 | 0.5% |
| 0.12 | 10 | 0.4% |
| 0.13 | 11 | 0.4% |
| 0.14 | 9 | 0.4% |
| 0.15 | 11 | 0.4% |
| 0.16 | 18 | 0.7% |
| 0.17 | 9 | 0.4% |
| 0.18 | 5 | 0.2% |
| 0.19 | 14 | 0.6% |
| 0.2 | 11 | 0.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.5 | 4 | 0.2% |
| 2.49 | 2 | 0.1% |
| 2.48 | 8 | 0.3% |
| 2.47 | 10 | 0.4% |
| 2.46 | 5 | 0.2% |
| 2.45 | 3 | 0.1% |
| 2.44 | 8 | 0.3% |
| 2.43 | 8 | 0.3% |
| 2.42 | 8 | 0.3% |
| 2.41 | 2 | 0.1% |
Age
Real number (ℝ)
| Distinct | 59 |
|---|---|
| Distinct (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 50.272 |
| Minimum | 21 |
|---|---|
| Maximum | 79 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 19.7 KiB |
Quantile statistics
| Minimum | 21 |
|---|---|
| 5-th percentile | 24 |
| Q1 | 36.75 |
| median | 50 |
| Q3 | 65 |
| 95-th percentile | 76 |
| Maximum | 79 |
| Range | 58 |
| Interquartile range (IQR) | 28.25 |
Descriptive statistics
| Standard deviation | 16.638893 |
|---|---|
| Coefficient of variation (CV) | 0.33097734 |
| Kurtosis | -1.1339881 |
| Mean | 50.272 |
| Median Absolute Deviation (MAD) | 14 |
| Skewness | 0.00064820544 |
| Sum | 125680 |
| Variance | 276.85276 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 24 | 69 | 2.8% |
| 59 | 62 | 2.5% |
| 47 | 61 | 2.4% |
| 68 | 60 | 2.4% |
| 72 | 60 | 2.4% |
| 78 | 59 | 2.4% |
| 43 | 58 | 2.3% |
| 33 | 57 | 2.3% |
| 51 | 57 | 2.3% |
| 49 | 57 | 2.3% |
| Other values (49) | 1900 | 76.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 21 | 41 | 1.6% |
| 22 | 27 | 1.1% |
| 23 | 27 | 1.1% |
| 24 | 69 | 2.8% |
| 25 | 31 | 1.2% |
| 26 | 34 | 1.4% |
| 27 | 38 | 1.5% |
| 28 | 34 | 1.4% |
| 29 | 48 | 1.9% |
| 30 | 50 | 2.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 79 | 31 | 1.2% |
| 78 | 59 | 2.4% |
| 77 | 33 | 1.3% |
| 76 | 41 | 1.6% |
| 75 | 46 | 1.8% |
| 74 | 32 | 1.3% |
| 73 | 51 | 2.0% |
| 72 | 60 | 2.4% |
| 71 | 25 | 1.0% |
| 70 | 33 | 1.3% |
BMI_Category
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 139.5 KiB |
Length
| Max length | 11 |
|---|---|
| Median length | 10 |
| Mean length | 8.0836 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Overweight |
|---|---|
| 2nd row | Underweight |
| 3rd row | Overweight |
| 4th row | Underweight |
| 5th row | Overweight |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Normal | 657 | 26.3% |
| Underweight | 642 | 25.7% |
| Overweight | 640 | 25.6% |
| Obese | 561 | 22.4% |
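The length statistics reported for BMI_Category follow directly from the category counts, since each label has a fixed number of characters. The sketch below recomputes them from the Common Values table; the variable names are illustrative.

```python
import statistics

# Category -> count, copied from the Common Values table above.
counts = {"Normal": 657, "Underweight": 642, "Overweight": 640, "Obese": 561}

# One string length per row of the dataset.
lengths = [len(label) for label, c in counts.items() for _ in range(c)]

total_chars = sum(lengths)
mean_length = total_chars / len(lengths)
median_length = statistics.median(lengths)

print(min(lengths), max(lengths), median_length, mean_length, total_chars)
```

The results agree with the Length table (minimum 5, maximum 11, median 10, mean 8.0836), and the character total of 20209 is the same total that the character-level tables for this variable report.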
Length
Histogram of lengths of the category
Common Values (Plot)
Words
| Value | Count | Frequency (%) |
|---|---|---|
| normal | 657 | 26.3% |
| underweight | 642 | 25.7% |
| overweight | 640 | 25.6% |
| obese | 561 | 22.4% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| e | 3686 | 18.2% |
| r | 1939 | 9.6% |
| w | 1282 | 6.3% |
| t | 1282 | 6.3% |
| h | 1282 | 6.3% |
| g | 1282 | 6.3% |
| i | 1282 | 6.3% |
| O | 1201 | 5.9% |
| N | 657 | 3.3% |
| o | 657 | 3.3% |
| Other values (9) | 5659 | 28.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 20209 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 20209 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 20209 |
Smoker
Boolean
| Value | Count | Frequency (%) |
|---|---|---|
| True | 1296 | 51.8% |
| False | 1204 | 48.2% |
Outcome
Categorical
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1310 | 52.4% |
| 0 | 1190 | 47.6% |
Length
Histogram of lengths of the category
Common Values (Plot)
Words
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1310 | 52.4% |
| 0 | 1190 | 47.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1310 | 52.4% |
| 0 | 1190 | 47.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2500 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2500 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2500 |
Notes
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 152.8 KiB |
Length
| Max length | 15 |
|---|---|
| Median length | 15 |
| Mean length | 13.5408 |
| Min length | 11 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | No symptoms |
|---|---|
| 2nd row | Severe symptoms |
| 3rd row | Mild symptoms |
| 4th row | No symptoms |
| 5th row | Mild symptoms |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Severe symptoms | 646 | 25.8% |
| Under treatment | 629 | 25.2% |
| Mild symptoms | 626 | 25.0% |
| No symptoms | 599 | 24.0% |
Length
Histogram of lengths of the category
Common Values (Plot)
Words
| Value | Count | Frequency (%) |
|---|---|---|
| symptoms | 1871 | 37.4% |
| severe | 646 | 12.9% |
| under | 629 | 12.6% |
| treatment | 629 | 12.6% |
| mild | 626 | 12.5% |
| no | 599 | 12.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| m | 4371 | 12.9% |
| e | 3825 | 11.3% |
| t | 3758 | 11.1% |
| s | 3742 | 11.1% |
| (space) | 2500 | 7.4% |
| o | 2470 | 7.3% |
| r | 1904 | 5.6% |
| y | 1871 | 5.5% |
| p | 1871 | 5.5% |
| n | 1258 | 3.7% |
| Other values (9) | 6282 | 18.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 33852 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 33852 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 33852 |
Interactions
Correlations
| | Age | BMI | BMI_Category | BloodPressure | DiabetesPedigreeFunction | Glucose | Insulin | Notes | Outcome | Pregnancies | SkinThickness | Smoker |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | 0.036 | 0.082 | 0.015 | -0.044 | 0.046 | -0.041 | 0.056 | 0.063 | -0.014 | 0.008 | 0.056 |
| BMI | 0.036 | 1.000 | 0.084 | -0.017 | -0.017 | 0.004 | 0.006 | 0.093 | 0.073 | 0.047 | -0.021 | 0.048 |
| BMI_Category | 0.082 | 0.084 | 1.000 | 0.069 | 0.106 | 0.069 | 0.098 | 0.084 | 0.030 | 0.084 | 0.080 | 0.000 |
| BloodPressure | 0.015 | -0.017 | 0.069 | 1.000 | 0.072 | -0.019 | 0.029 | 0.069 | 0.106 | 0.037 | 0.055 | 0.050 |
| DiabetesPedigreeFunction | -0.044 | -0.017 | 0.106 | 0.072 | 1.000 | -0.037 | -0.042 | 0.096 | 0.087 | -0.053 | 0.003 | 0.037 |
| Glucose | 0.046 | 0.004 | 0.069 | -0.019 | -0.037 | 1.000 | 0.003 | 0.061 | 0.040 | -0.009 | -0.005 | 0.010 |
| Insulin | -0.041 | 0.006 | 0.098 | 0.029 | -0.042 | 0.003 | 1.000 | 0.080 | 0.125 | 0.028 | -0.043 | 0.032 |
| Notes | 0.056 | 0.093 | 0.084 | 0.069 | 0.096 | 0.061 | 0.080 | 1.000 | 0.048 | 0.078 | 0.077 | 0.000 |
| Outcome | 0.063 | 0.073 | 0.030 | 0.106 | 0.087 | 0.040 | 0.125 | 0.048 | 1.000 | 0.067 | 0.069 | 0.000 |
| Pregnancies | -0.014 | 0.047 | 0.084 | 0.037 | -0.053 | -0.009 | 0.028 | 0.078 | 0.067 | 1.000 | -0.009 | 0.053 |
| SkinThickness | 0.008 | -0.021 | 0.080 | 0.055 | 0.003 | -0.005 | -0.043 | 0.077 | 0.069 | -0.009 | 1.000 | 0.060 |
| Smoker | 0.056 | 0.048 | 0.000 | 0.050 | 0.037 | 0.010 | 0.032 | 0.000 | 0.000 | 0.053 | 0.060 | 1.000 |
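Every off-diagonal coefficient in this matrix is small, so it helps to list only the pairs that stand out. The sketch below scans the upper triangle for coefficients whose absolute value exceeds 0.1. That threshold is an arbitrary choice for illustration, not something the report defines, and the report does not state which association measure was used for each pair of variables.

```python
# Variable names in the order used by the correlation table above.
names = ["Age", "BMI", "BMI_Category", "BloodPressure",
         "DiabetesPedigreeFunction", "Glucose", "Insulin", "Notes",
         "Outcome", "Pregnancies", "SkinThickness", "Smoker"]

# Coefficients copied row by row from the table.
matrix = [
    [1.000, 0.036, 0.082, 0.015, -0.044, 0.046, -0.041, 0.056, 0.063, -0.014, 0.008, 0.056],
    [0.036, 1.000, 0.084, -0.017, -0.017, 0.004, 0.006, 0.093, 0.073, 0.047, -0.021, 0.048],
    [0.082, 0.084, 1.000, 0.069, 0.106, 0.069, 0.098, 0.084, 0.030, 0.084, 0.080, 0.000],
    [0.015, -0.017, 0.069, 1.000, 0.072, -0.019, 0.029, 0.069, 0.106, 0.037, 0.055, 0.050],
    [-0.044, -0.017, 0.106, 0.072, 1.000, -0.037, -0.042, 0.096, 0.087, -0.053, 0.003, 0.037],
    [0.046, 0.004, 0.069, -0.019, -0.037, 1.000, 0.003, 0.061, 0.040, -0.009, -0.005, 0.010],
    [-0.041, 0.006, 0.098, 0.029, -0.042, 0.003, 1.000, 0.080, 0.125, 0.028, -0.043, 0.032],
    [0.056, 0.093, 0.084, 0.069, 0.096, 0.061, 0.080, 1.000, 0.048, 0.078, 0.077, 0.000],
    [0.063, 0.073, 0.030, 0.106, 0.087, 0.040, 0.125, 0.048, 1.000, 0.067, 0.069, 0.000],
    [-0.014, 0.047, 0.084, 0.037, -0.053, -0.009, 0.028, 0.078, 0.067, 1.000, -0.009, 0.053],
    [0.008, -0.021, 0.080, 0.055, 0.003, -0.005, -0.043, 0.077, 0.069, -0.009, 1.000, 0.060],
    [0.056, 0.048, 0.000, 0.050, 0.037, 0.010, 0.032, 0.000, 0.000, 0.053, 0.060, 1.000],
]

THRESHOLD = 0.1  # arbitrary cut-off, chosen only for illustration

# Collect each unordered pair once by walking the upper triangle.
strong = []
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        value = matrix[i][j]
        if abs(value) > THRESHOLD:
            strong.append((names[i], names[j], value))

# Strongest association first.
strong.sort(key=lambda pair: -abs(pair[2]))
for a, b, value in strong:
    print(f"{a} vs {b}: {value:.3f}")
```

Only three pairs pass the cut-off: Insulin with Outcome (0.125), BMI_Category with DiabetesPedigreeFunction (0.106), and BloodPressure with Outcome (0.106). All three are still weak associations.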
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
Sample
First rows
| | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | BMI_Category | Smoker | Outcome | Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 107.7 | 110.5 | 31.5 | 207.0 | 30.1 | 1.85 | 49 | Overweight | False | 0 | No symptoms |
| 1 | 1 | 101.0 | 119.8 | 27.5 | 47.8 | 34.9 | 2.39 | 30 | Underweight | True | 1 | Severe symptoms |
| 2 | 3 | 97.8 | 85.8 | 20.7 | 213.0 | 22.0 | 2.10 | 52 | Overweight | False | 1 | Mild symptoms |
| 3 | 6 | 123.4 | 79.2 | 34.2 | 140.8 | 23.1 | 2.27 | 43 | Underweight | False | 0 | No symptoms |
| 4 | 7 | 161.5 | 114.4 | 44.9 | 284.3 | 41.2 | 2.05 | 59 | Overweight | False | 1 | Mild symptoms |
| 5 | 2 | 98.3 | 71.5 | 50.0 | 55.4 | 24.9 | 1.20 | 29 | Obese | False | 1 | Severe symptoms |
| 6 | 8 | 117.7 | 96.9 | 34.3 | 198.2 | 28.3 | 0.70 | 57 | Normal | False | 1 | No symptoms |
| 7 | 1 | 120.8 | 84.3 | 26.2 | 123.9 | 41.5 | 0.99 | 72 | Underweight | False | 0 | No symptoms |
| 8 | 7 | 196.9 | 62.5 | 19.0 | 287.7 | 35.9 | 0.15 | 60 | Normal | False | 0 | Severe symptoms |
| 9 | 7 | 194.1 | 80.8 | 20.3 | 109.0 | 27.2 | 1.36 | 34 | Obese | True | 1 | No symptoms |
Last rows
| | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | BMI_Category | Smoker | Outcome | Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2490 | 9 | 197.6 | 71.5 | 36.3 | 99.5 | 30.7 | 1.72 | 62 | Underweight | True | 1 | Mild symptoms |
| 2491 | 0 | 139.9 | 106.3 | 39.3 | 198.5 | 43.7 | 1.20 | 41 | Normal | True | 1 | Severe symptoms |
| 2492 | 0 | 188.5 | 81.1 | 22.5 | 39.8 | 26.0 | 0.27 | 55 | Underweight | False | 0 | Mild symptoms |
| 2493 | 2 | 86.4 | 76.8 | 35.6 | 15.3 | 30.3 | 2.24 | 47 | Obese | False | 0 | Severe symptoms |
| 2494 | 2 | 81.4 | 81.7 | 11.3 | 224.8 | 21.0 | 1.48 | 51 | Underweight | False | 0 | Under treatment |
| 2495 | 2 | 97.5 | 93.0 | 33.3 | 205.4 | 18.5 | 0.93 | 28 | Normal | False | 1 | Severe symptoms |
| 2496 | 5 | 77.3 | 75.7 | 27.4 | 92.2 | 31.5 | 1.27 | 27 | Obese | False | 1 | Severe symptoms |
| 2497 | 4 | 182.7 | 71.6 | 36.2 | 42.0 | 19.6 | 1.18 | 60 | Underweight | True | 0 | Severe symptoms |
| 2498 | 9 | 113.4 | 89.9 | 23.3 | 158.9 | 38.3 | 0.90 | 69 | Overweight | True | 1 | Mild symptoms |
| 2499 | 2 | 117.8 | 77.9 | 23.5 | 223.6 | 24.4 | 0.44 | 49 | Underweight | False | 1 | No symptoms |
Duplicate rows
Most frequently occurring
| | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | BMI_Category | Smoker | Outcome | Notes | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 249 | 2 | 101.8 | 62.2 | 25.9 | 45.2 | 20.8 | 0.94 | 70 | Overweight | True | 0 | Severe symptoms | 6 |
| 24 | 0 | 109.2 | 84.9 | 48.2 | 255.9 | 31.5 | 0.19 | 65 | Normal | False | 1 | Under treatment | 5 |
| 99 | 0 | 195.5 | 83.9 | 27.2 | 55.1 | 18.4 | 2.07 | 75 | Normal | True | 1 | Severe symptoms | 5 |
| 228 | 2 | 79.5 | 60.0 | 20.3 | 179.8 | 35.4 | 2.20 | 52 | Normal | True | 1 | No symptoms | 5 |
| 229 | 2 | 81.4 | 81.7 | 11.3 | 224.8 | 21.0 | 1.48 | 51 | Underweight | False | 0 | Under treatment | 5 |
| 340 | 3 | 118.5 | 116.3 | 21.8 | 256.3 | 32.6 | 1.13 | 52 | Obese | False | 0 | Severe symptoms | 5 |
| 380 | 3 | 174.0 | 75.9 | 29.2 | 289.2 | 36.5 | 1.31 | 24 | Overweight | False | 0 | Under treatment | 5 |
| 430 | 4 | 107.6 | 92.8 | 30.6 | 137.4 | 30.0 | 1.88 | 59 | Underweight | True | 0 | Mild symptoms | 5 |
| 484 | 4 | 182.7 | 71.6 | 36.2 | 42.0 | 19.6 | 1.18 | 60 | Underweight | True | 0 | Severe symptoms | 5 |
| 642 | 6 | 122.3 | 68.1 | 26.3 | 141.2 | 22.6 | 1.04 | 35 | Overweight | False | 0 | No symptoms | 5 |